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High-Speed  Recursive  Digital   Filter  Realization 


ABSTRACT 

Pipeline  techniques  have  been  successfully  applied  to  speeding  up  proc- 
essing in  both  general  and  special  purpose  digital  computers.  Application  of 
these  techniques  to  non-recursive  (FIR)  filters  has  been  suggested  and  is  quite 
straightforward.  Application  to  recursive  ( I IR )  filters  has  not  previously 
been  shown.  In  this  paper,  the  technique  for  applying  pipeline  techniques  to 
recursive  filters  is  shown  and  the  advantages  and  disadvantages  of  the  tech- 
nique are  discussed.  Using  these  techniques,  recursive  digital  filters  oper- 
ating at  hitherto   impossibly  high  rates  can  be  designed. 


INTRODUCTION 

Digital  filters  have  an  important  place  in  the  technology  of  processing 
signals.  Because  of  the  sample  theorem,  the  sampling  rate  which  is  also  the 
clock  rate  of  the  filter,  must  be  higher  than  twice  the  highest  frequency 
component  in  the  signal  to  be  filtered.  Thus  the  use  of  digital  filters  in 
real   time  application   involving  high  frequencies  demands  high  sample  rate. 

Pipeline  techniques  are  well  known  [1,2]  ways  to  increase  the  clock  rate 
(throughput)  of  a  particular  state-of-the-art  realization  of  a  logic  module. 
For  example,  the  Cray  I  supercomputer  has  a  clock  rate  of  80  Mhz  and  requires  a 
7  stage  pipeline  multiplier  to  perform  64-bit  floating  point  multiplication  at 
the  rate  of  one  product  every  12.5  ns.   [3], 

Other  workers  have  investigated  residue  number  techniques  to  obtain  high 
rate  digital  filters  [4].  However,  difficulties  in  implementing  division  by  a 
constant  impose  limitations  on  such  realizations,  making  scaling  awkward  and 
precluding  anything  resembling  floating  point. 

Pipeline  techniques  can  thus  be  used  to  produce  multipliers  and  adders 
that  operate  at  the  maximum  clock  rate  possible  for  a  given  state  of  the  art  of 
logic  devices.  For  example,  TRW  makes  a  non-pipeline  16  bit  integer  multiply 
chip  with  input  and  output  registers  that  can  be  operated  at  the  rate  of  about 
9  MHz.  Pipeline  techniques  could  be  applied  to  that  design,  producing  a  new 
chip  that  might  have  three  stages  and  operate  at  near  40  MHz. 

Discrete  logic  realization  of  the  basic  functions  of  multiplication  and 
division  can  yield  even  better  performance.  At  the  extreme  we  have  some  of  the 
integrated  circuit  ECL  lines.  Based  on  the  experience  gained  with  the  pipeline 
supercomputers,  it  is  reasonable  to  expect  that  a  three  stage  12-bit  integer 
multiply  and  1  to  2  stage  16-bit  integer  adder  could  be  built  to  operate  at 
around   80  MHz.      Floating   point    operation   could    be   obtained    at    the    same    rate 


with  a  cost  of  perhaps  only  2  additional  stages  for  the  multiplier  and  3  more 
for  the  adder. 

What  finally  limits  the  clock  rate  in  a  pipeline  system  is  the  net  delay 
of  all  logic  and  the  register  delay  between  successive  registers,  however 
pipeline  techniques  will  yield  the  highest  throughput  or  clock  rate  possible 
with  a  given  state  of  the  art. 

Now  it  should  be  clear  why  we  should  consider  the  application  of  pipeline 
techniques. 

1)  higher  clock  (sample)  rates  are  possible  for  a  given  state  of  the 
art. 

2)  more  complex  operations  (i.e.  floating  point)  are  possible  for  a  given 
state  of  the  art  and  clock  rate. 

Pipeline  techniques  can  be  easily  applied  to  non-recursive  systems,  that 
is,  systems  without  feedback.  In  the  digital  filter  arena,  this  would 
correspond  to  non-recursive  or  Finite  Impulse  Response  (FIR)  filters.  These 
techniques  are  straight  forward  and  proceed  as  follows: 

An  FIR  filter  can  be  described  by  a  transfer  function  in  the  delay 
operator  (z~l  or  D)  as   in  (1) 

1  =  HU"1)  =  aQ  +  a1  z'1  +...+  amz'm  (1) 

Figure  1  shows  the  realization  of  such  a  filter  using  delayless  add  and 
multiply  modules  as  well  as  unit  delays.  That  realization  is  unrealistic  and 
does  not  capitalize  on  the  advantages  of  pipeline  processors  as  discussed 
above. 

Let  us  suppose  that  we  have  a  1-stage  pipel  ine  adder  and  a  3-stage 
pi  pel ine  mul tipl ier. 


Figure  2  shows  a  pipeline  realization  of  the  transfer  function  given  in 
(1).  In  fact,  (1)  is  not  exactly  realized  because  of  the  delay  of  the  adders. 
Instead  the  output  of  the  filter  is  y  delayed  by  7  clock  periods.  This, 
however,  is  not  a  serious  difficulty. 

Thus,  pipeline  realization  of  FIR  filters  is  straight  forward,  costing 
only  an  added  delay  in  the  resulting  output  sequence  rate,  and  perhaps  some 
additional  adders  over  the  realization  of  figure  1,  if  the  adder  has  more  than 
one  stage. 

What  then  about  doing  this  for  I IR  filters?  Applying  pipeline  techniques 
to  recursive  calculations  is  not  so  straightforward  as  it  is  in  the  non- 
recursive  case,  and  was  first  reported  in  [1]  in  connection  with  accumulator 
design. 

In  the  next  section  we  shall  consider  the  design  of  recursive  ( I  IR ) 
filters.  Following  that,  we  discuss  the  general  attributes  of  the  solution  and 
some  of  the  unsolved  problems  associated  with  the  technique.  Finally,  an 
example  of  a  sixth  order  Butterworth  filter  is  treated,  to  illustrate  the 
technique. 


PIPELINE  RECURSIVE  DESIGN 
An  n^  order  recursive  or  Infinite  Impulse  Response  (IIR)  filter  can 
be  characterized  by  a  ratio  of  two  nth  degree  polynomials  in  z~*,  the  delay 
operator,  as  shown  in  (2) 

an  +  a,z"  +  ...  +  amz~m  ./  » 

v  _      °    j m „  -       AU)  (?) 

y"~i h  .-1  h  7* h  ,"n        B(z)  X  (Z) 

1  -  b,z   -b0z   -  . . .  -  b  z 
1     c  n 

This  representation  can  be  expressed  using  D,  another  notation  for  the 
delay  operator,  as  shown  in  (3).  Time  is  measured  in  integer  steps  by  the 
index   i. 

AfDv  (aQ  +  a,D  +  ...  +  a  Dm)   x(i) 

y(D  =X|x(i)  =— 5 1 ?L— (3) 

ti[U>  (1  -  b,D  -  ...   -  b  Dn 

l  n 

where 

Dkx(i)   =  x(i-k) 

Rearranging  (3)  yields  the  familiar  recursive  formulation  of  the  filter 
equation,  as  shown   in   (4) 

y(i)  =  (aQ  +  9lD  +  ...  +  amDm)   x(i)  +   (t^D  +  ...  +  bnDn)  y(i)         (4) 

Equation  (4)  leads  directly  to  the  canonical  representation  of  the  filter  as 
shown  in  figure  3,  assuming  the  availability  of  a  multiply-add  unit  with  unit 
delay  (one  clock  pulse  period  delay).  This  realization  actually  does  not  yield 
y,  but  in  fact  produces  Dy,  y  delayed  by  one  clock  pulse  period.  The  canonical 
form  shown  in  this  figure  can  be  realized,  but  unfortunately  because  the 
multiply-add  unit  is  very  complex,  the  realization  will  be  quite  slow.  If  we 
assume  12  bits  of  significance  to  the  representation  of  the  numbers  in  the 
filter,    we    find    that    the    unit    has    two    1-input,    12-bit    constant    multipliers 


feeding  a  three- input  by  12-bit  adder.  All  of  this  complexity  forces  the  clock 
pulse  period  to  be  very  long,  and  the  clock  or  sampling  rate  as  a  consequence 
to  be  low.  A  typical  value  or  clock  rate  using  state-of-the-art  Schottkey 
integrated  circuit  PROMs  and  adders  would  be  on  the  order  of  10  MHz.  A  slight 
improvement  in  the  rate  of  operation  is  possible  if  we  use  the  basic  module 
shown  in  figure  4a.  A  delay  of  1  clock  pulse  period  is  produced  by  the  output 
register.  This  module  could  be  built  to  operate  at  a  clock  rate  of  perhaps  12 
MHz,  compared  with  the  realization  of  figure  3. 

What  is  desired  is  some  means  to  take  advantage  of  higher  clock  rate  pipe- 
line realization  of  the  basic  multiply-add  module,  thus  producing  a  higher  sam- 
pling rate  realization  of  the  recursive  digital   filter. 

In  order  to  increase  the  clock  rate  of  the  multiply-add  unit,  we  first  use 
the  two  input  version,  and  then  insert  several  registers  for  intermediate  re- 
sults in  the  coefficient  multiply  and  adder  sections.  For  example,  a  ?-stage 
version  is  shown  in  Figure  4b  that  could  be  made  to  operate  at  16  MHz.  The  net 
result  is  a  km  +  ka-stage  (register)  pipeline  multiply-add  unit  as  shown  in 
figure  5,  with  km  stages  involved  in  the  coefficient  multiply  and  ka  stages 
in  the  adder. 

When  we  attempt  to  use  the  module  of  figures  4  or  5,  we  find  it  impossible 
to  insert  it  in  the  feedback  loop,  preserving  a  minimum  delay  of  only  one.  So, 
if  we  are  to  use  the  pipeline  module,  some  other  way  must  be  found. 

For  the  purposes  of  the  derivation  which  follows,  we  will  restrict  our 
consideration  to  second-order  filters,  that  is  m=2,  n=2.  This  in  no  way  limits 
the  result,  for  the  process  used  is  not  dependent  on  n,  except  for  the  value  of 
the  individual  coefficients.  On  the  other  hand,  higher  order  filters  can  also 
be  realized  as  the  cascade  of  [n/2]  2nd  order  filters,  and  at  most  one 
filter  of  lower  order. 


Now  we  let  n=2  and  consider  that  specific  case  of 

Y       A(2~1)  a0  +  alz_1  +  a2z"2  ,K, 

A       B(z  l)  1  -  bxz  X   -  b2z  * 

or     represented     in     terms     of     the     delay     operator     D,     the     usual     computer 
representation  of  delay. 

,2 


(6) 


x(i)         a0  +  alD  +  a2D 
^"     1   -  bjD  -  b2D^ 

Writing  this  in  a  different  form  we  have 

y(i)  =  (aQ  +  axD  +  a2D2)x(i)  +  (b^  +  b2D2)y(i)  (7) 

which  can  also  be  written  in  difference  equation  form: 

*1  =  Vi  +  alxi-l  +  a2xi-2  +  <bl*1-l  +  Vi-2>  (8) 

For  subsequent  development  we  will  use  (7);  furthermore,  we  will  omit  the  (i) 
notation.  We  will  relate  that  form  to  the  other  forms  where  appropriate. 
Let  us  first  ignore  the  A(D)  polynomial  by  substituting 

x'  =  A(D)x  (9) 

into  (7)  yielding 

y  =  x'   +  (bjDy  +  b2D2y)  (10) 

Delaying  (10)  and  substituting  for  Oy  in  (9)  we  obtain 
y  =  x'  +  b^Dx'  +  b1D2y  +  b2D3y)  +  b2D2y 


or 

y  =   (1  +  b1D)x'  +     (b\  +  b2)D2y  +  bjb^y  (11) 

(11)  is  now  a  third  order  difference  equation  describing  the  same  digital  fil- 
ter. That  is  to  say,  the  filter  described  by  (11)  has  the  same  transfer  char- 
acteristic as  the  one  described  by  (7)  or  (10).  Delaying  (10)  by  \r  and 
substituting   into  (11)  yields 

y  =  [1  +  bxD  +  (b2  +  b2)D2]x' 

+   (2btb2  +  b^)D3y  +   (b2  +  h2b2)D4y  (12) 

In  general,  successively  more  delayed  versions  of  (10)  can  be  substituted, 
raising  the  order  of  the  difference  equation,  but,  more  importantly,  raising 
also  the  minimum  delay  associated  with  the  feedback  y  terms.  The  general 
higher  order  difference  equation  has  the  following  general  form,  provided  we 
started  from  a  second-order  equation: 


y  =  [1  +  o<p)  D  +  4P)  D2  ♦  ...  +  >'  t)P]x' 


+     bill°P+ly  *  »$DP+2y  (13) 

In    fact,    if   we    began    with    an    n       order    difference    equation,    we    would    have 
an  equation  of  p+n  order  resulting  from  the  foregoing  process  as  shown   in  (14) 

y  =  [1  +  ajp)  D  +  ...  +  Jp)  D  ]x' 

♦     b(P>0P+1y  ♦  ...   ♦  b<P>DP+ny  (14) 

The  coefficients  a\p'   and  b\?'     can  be  calculated  by  following  the  process 
described  above  in  detail.     For  example,   it  should  be  clear  from  (12)  that 


(2)         k  (2)      •  h2  a    k 

al  1    ■  a2  12 

b^2)     =  2b1b2  +  b3     ;  b£2)       =  b2     +  b2b2  (15) 

where  the  original    b  coefficients  are  from  (10).     We  will    cover  the  process    in 
detail   a  little  later   in  this  section. 

Now  let  us   recall    what  x'    is    in   terms  of  the  original    IIR   filter   as   des- 
cribed  in  (7).     Substituting  x'    =  A(D)x   into  (13)  yields: 

y  =  [1  +  ajp)   D  +  ...   +  Jp)   np]  [aQ  +  axD  +  a2D2]x 

♦     b$Dp+1y  ♦  b^Dp+2y  (16) 

This  equation  when  multiplied  out  for  the  case  of  p=2  yields  (17) 

y=[a^+a!2>D  +  ...  +  af  D4]x 

+   [b<2)   D3  y  +  b|2)   D4  y]  (17a) 


and 


w'   =DV[af+afW...  +  a<2¥]x 


+  [b^2)D3  w'   +  b^2)D4  w']  (18a) 

These  two  equations  look  very  similar  now,  the  only  difference  being  that 
the  non-recursive  parts  of  (18a)  is  the  non-recursive  part  of  (17a)  delayed  by 
A  +  4.      If  both  sides  of  (17a)   are  delayed  by  that  amount,   (17b)   results: 

Di+4y  =  Di+4[a<2>+af>D  +  ...+a<2>D4]x 

+[b(2>Di+7y+b<2>D*+8y]  (17b) 


Finally,    it    should    be    noted    that    DA    y    in    (17b)     is    the    same    as    w'     in 
(18a).  substituting  DA+4y  =  w'    in  (17b)  produces  (17c): 


w'   =D*+4[a(2>+a{2>+...  +  afD4]x 


+  [b^2)D3  w'   +  b^2)D4  w']  (17c) 

which  is  the  same  as  (18a). 

What  this  all  means  is  that  the  structure  of  figure  6  will  produce  an  out- 
put w'  which  is  simply  the  desired  signal  y,  delayed  by  A+4  sampling  periods. 
This  digital  filter  structure  uses  pipeline  logic  units  in  both  the  recursive 
and  non-recursive  portions  to  realize  the  desired  output. 

The  design  still  needs  to  be  completed,  even  though  the  heart  of  the 
structure  has  been  developed.  Figure  7  shows  the  structure  of  the  non- 
recursive  portion  of  the  filter  to  complete  the  example. 

The  circuit  of  figure  7  produces  D4[ao  +  a^D  +...+  a4D4]x  which 
corresponds  to  the  DA[ao  +  a^D  +...+  a4Dd]x  term  required  as  the  out- 
put of  the  non-recursive  part  of  figure  7.   Therefore,  the  pipe!  ine  digital 

4+4 
filter  of  figure  7  combined  with  figure  6  produces  D   y,  that  is  y,  de- 
layed by  8  sample  periods. 

Now  we  return  to  the  calculation  of  the  ay'  and  bjp^ 
coefficients.  More  general  relations  than  (15)  can  be  developed  in  difference 
equation  form  with  respect  to  p. 

Starting  with  the  p  augmented  order  (p+2  order)  difference  equation  (13) 
we  can  derive  the  p+1  augmented  order  equation  by  substituting  for  Dp  y 
[(10)  delayed  by  Dp+1]  into  (13). 
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y  -  [1  +  a[p)  0  +  ...  +  a'p)  Dp]x' 

♦  b^}  [DP+1  »'  Hbx  D^2  y  ♦  b2  Op+3  y)] 


*  »$  °P+2  » 

y  -  [1  ♦  «{P>  D  ♦  ...   ♦  «(P>  Dp  ♦  b<P>  DP+1]x' 

♦  (b$  bj  +  b'P»)  OP"2  y  ♦  b<P]  b2  DP+3y  (19) 
From  (19)  we  can  see  that 

#«>."b(P>  (20.) 

P^1'  -  #1  bl  +  "»$  <20b> 

^'  "  «#}  b?  (2°0 

Where  0^=  1,  b{0)  =  b1§  and  b^  =  bg. 

Table  1  shows  the  values  of  the  coefficients  of  (13)   for  the  values  of  p  from  0 

t  h 

through    5    as    derived    from    (20).        Furthermore,    multiplication    of    the    p 
order  polynomial 

a<p)    +  aj^D1   +...+  aiP)DP 
U  1  p 

by  the  original 

2 

aQ  +  a^D  +  a2D 

yields  the  polynomial 

a<,P>  ♦  a{P»D    ♦...+  a<P»DP+2 
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as    shown     in     (16)     and     (18).        Table    2    shows     the    values     of     ay'     for     p 
from  0  through  5. 

The  unique  structure  of  this  real  ization,  which  differs  from  conventional 
realizations  of  digital  filters  is  contained  in  the  recursive  portion,  where 
the  basic  feedback  loop  involving  the  longest  delay  passes  through  p+2  pipeline 
stages,    for    the    realization    of   a    2n      order    filter.       In    the    example    devel- 

■f"h 

oped,  p  was  2,  causing  us  to  use  a  4   order  pipeline  filter  represen- 
tat  ion  of  the  original  2   order  filter. 

In  general,  we  assume  we  have  ka  stage  pipeline  add  units  and  km  stage 
pipeline  multipliers.  The  structures  of  the  ral  ization  of  the  feedback  portion 
of  the  filter  would  be  as  shown  in  figure  8.  Since  the  maximum  delay  around 
the  loop  must  match  the  p  augmented  difference  equation  order,  we  have 

p  =  2ka  +  km-2  (21) 

The  FIR  portion  of  the  filter  is  shown  in  general  terms  in  figure  14. 
From  this  figure,  it  should  be  clear  that 


*-  |i°92(ka)lk«*k.*k- 
and  that  the  over  all  delay  is  therefore 


A  +  2k 
a 


=   pog2(ka)]ka  +  3ka  +  km  (22) 


STABILITY   ISSUES 
In  the  previous  section  we  have  seen  the  general    process  of  realization  of 
a  digital    filter  transfer  function  using  pipel  inemultiply-add   units  with   delay 
2.      We   have  also   shown   a    realization   using  2-stage  multiply-add    units.      The 


essence  of  the  techique    is   to   represent   a   second-order  section   by  means    of   a 

t  h 

(2+p)        order    section    of    a    particular    form,    where    p    is    determined    by    the 
number  of  stages   in  the  pipeline  multiply  and  add  units. 
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We  know  from  other  considerations  that  the  transfer  function  of  the  higher 
order  real  ization  must  in  fact  be  the  same  as  the  original  and  therefore  must 
have  the  same  poles  and  zeros  in  the  z  plane.  Thus,  the  operation  which  trans- 
forms (7)    into  (17)   can  be  thought  of  in  the  following  way: 


(aQ  +  a^"1   +  a2z"2)    (1  +  c^z"1   +  ...  ♦  c£p)z"p) 
(1   -  b^"1   -  b2z*)      (1   +  ^T1   +  ...   +  a^^z'P) 


3(P)  +  a(P)  z"l  +  +  a(P)  z"P-? 

ao    +  ai     z     *  •••  +  V? z 

1         p+1   z  Dp+2   z 


(24) 


Where  the  coefficients  of  the  a  polynomial  depend  on  the  original  bj  and 
b£  of  (5)  and  (7).  These  a  coefficients  are  governed  by  (20)  and  some  were 
tabulated  in  Table  lb.  Thus,  we  raise  the  order  of  the  denominator,  introduc- 
ing p  new  poles  and  corresponding  cancelling  zeros.  These  new  poles  correspond 
to  the  roots  of 

0   =  1+  ajp)    D  +  ...   +  a£p)   DP  (25) 

The  roots  of  a(D)  of  course  are  the  poles  in  the  z"  plane  and  are 
the  reciprocal  of  the  poles  in  z  plane. 

One  concern  in  the  realization  is  that  the  filter  be  stable,  that  is,  that 
it  have  an  impulse  response  which  decays  to  zero.  A  sufficent  condition  for 
the  stability  of  a  digital  filter  with  transfer  function  H(z"  )  is  that 
the  poles  of  H(z"  )  in  the  z"  plane  lie  outside  the  unit  circle, 
and  that  the  order  of  the  numerator  (-m)  be  less  than  or  equal  in  magnitude  to 
the  order  (-n)  of  the  denominator.  A  more  familiar  sufficient  condition  is 
that  the  poles  of  the  transfer  function  of  z  (H(z))  be  inside  the  unit  circle. 
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Now,  if  we  start  with  a  stable  filter,  we  would  like  to  be  assured  that 
the  poles  of  the  augmented  order  filter  all  be  outside  the  unit  circle  in  the 
z  plane.  We  desire  this  because,  even  though  the  added  poles  are 
cancelled  by  the  zeros  of  (1  +  crp'z~  +  ...  +  crp'z~p),  realization  imperfec- 
tions will  prevent  exact  cancellation  and  the  augmented  filter  would  be 
unstable. 

In  order  to  examine  the  stability  question,  we  will  make  use  of  Jury's 
stability  test  [6]  as  applied  to  the  successively  higher  degree  denominator 
polynomials  in  z  as  p  increases. 

We  start  with  the  denominator  of  (5),  written  as  a  polynomial    in  z 

P(z)  =  z2  -  bjZ  -  b2  (26) 

To  insure  that  the  original  poles  of  (5)  are  inside  the  unit  circle  and 
hence  that  the  original  filter  is  stable,  we  apply  Jury's  test  with  necessary 
conditions: 

P(l)  =  1   -  bl  -  b2  >  0 

(-l)nP(-l)  =  1  +  bx  -  b2  >  0 

and  sufficient  condition:  \  (27) 

|  b2  |  <  1 

Hence,  for  this  second-order  digital  filter  to  be  stable,  the  values  of  b]_ 
and  b2  must  be  in  the  shaded  area  as  shown  in  Fig.  10.  Note  that  this  is  the 
case  when  p  =  0,   i.e.,  no  augmentation. 

Now,  assuming  that  the  original  filter  transfer  function  (5)  is  stable, 
i.e.,    (27)    is    satisfied,    it    is    desirable    to    determine    conditions    to    assure 
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stability  of  the  new  p  poles  introduced  in  the  system  when  the  transfer 
function  is  p-augmented.  For  p  =  1,  the  new  polynomial  introduced  in  the 
numerator  and  denominator  is 

F(z)   =  z  +  bj  (28) 

Applying  Jury's  test,  conditions  for  stability  are 
F(l)   =  1  +  b,   >  0 

(-1)"F(-1)   =  1   -  bj   >  0 

This  condition,  |  b^  |  <  1,  is  shown  graphically  in  figure  11.  Similarly,  for 
p  =  2,  the  added  polynomial    is 

F(z)  =  z2  +  bxz  +  (b2  +  b2)  (30) 

Jury's  test  gives  conditions  for  stable  roots.     The  necessary  conditions  are 

F(l)   =  1  +  bl  +  (b2  +  b2)  >  0 

(-l)nF(-l)  =  1  -  bj  +  (b2  +  b?)   >  0        J 
and  the  sufficient  condition  is 

|  b2  +  b2  |  <  1 

The  region  defined  by  conditions  (31)  is  shown  graphically  in  figure  12 
Finally,  for  p  =  3 

F(z)  =  z3  +  bxz2  +  (b2  +  b2)z  +  (b3  +  2b1b2) 
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and  the  conditions  for  stable  poles  are  (figure  13): 

F(l)  =  1  +  bx  +  {b\  +  b2)  +  (b\  +  2bxb2)  >  0 

(-l)nF(-l)  =  1  -  bj  +  {b\  +  b2)  -  (bj  +  2bxb2)  >  0  i  (32) 

|  bj  +  2bxb2  |    <  1 

|  [b\  +  2b1b2)2  -  1  |    >  |  b1(bj  +  2bxb2)  -  (bj  +  b2)  | 

This  procedure  can  be  continued  for  higher  values  of  p;  in  fact  general 
stability  tests  based  on  Jury's  test  haye  been  developed  [61.  Unfortunately, 
those  tests  are  cumbersome  to  apply. 

Instead,  let  us  take  another  approach. 

From  figures  10-13  it  may  be  observed  that  as  the  value  of  p  increases, 
the  area  of  stability  (the  shaded  areas  in  the  figures)  tends  to  increase.  It 
was  seen  graphically  that  as  p  increased,  the  stable  region  came  to  be  closer 
and  closer  to  the  original  shaded  area  for  p  =  0,  i.e.,  figure  10.  This  leads 
to  the  obvious  conjecture  that  all  stable  filters  have  a  stable  augmented 
filter  for  some  large  enough  p. 

We  will  follow  the  procedure  suggested  by  Voelcker  [7]  for  Block  Filters 
to  show  that  for  a  large  value  of  augmentation,  p,  the  pipelined  version  of  the 
original  filter  will  always  be  stable,  provided  that  the  original  second  order 
filter  is  stable.  This  is  a  yery  important  and  useful  conclusion  and  implies 
that  an  augmented  stable  realization  is  always  possible  if  the  original  trans- 
fer function   is  stable.     The  proof  for  this  fact  follows: 

Since  the  process  of  augmentation  implies  multiplying  the  numerator  and 
the  denominator  by  the  same  polynomial,  the  original  transfer  function  may  be 
equated  to  the  augmented  transfer  function  as 
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(Z)(«W  ♦  «j»,V1  +  ...  +  aJP)z"P) 


N(z)       .  "^'^0   r  gl  *   -  -.  -  «p 


tt — r~yi 


1  -  blx-  -  b2z-     l-z-'^'^^b^z-1) 

where  N(z)  is  the  numerator  of  the  original  transfer  function.  Henoting  the 
denominator  of  the  original  transfer  function  as  0(z),  since  this  original  fil- 
ter is  stable,  the  roots  of  D(z)  are  inside  the  unit  circle.  Figure  14(a) 
shows  the  original  filter  where  Hp(z)  is  a  pole  only  function  representing 
the  denominator,  (1/D(z))  of  the  original  filter. 

The  process  of  augmenting  the  difference  equation  suggests  building  the 
filter  as  shown  in  figure  14(b)  where 


„(Z)  =  a<p)  +  a'P'r1  +  ...  ♦  a'P»z-p 


and     S(z)  -  b«P>  ♦  bjj'z-1 


1      «(z) 


Thus     Hp(z)  -  ^  -  _-^T1_  (33, 

Hence,  the  nonrecursive  transfer  function  o(z)  may  be  written  as 

a(z)  -  Hp(z)  -  z"<ptl>  §f|}  (34) 

Since  H  (z)    1S  a  P°^e  only  transfer  function 
P 


H.(Z)   •  -j-i 


mA 


i=l 


2        a 
i=l  1  -  -4 


z 
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Taking  the  inverse  transform 

hp(n)  -  j^   a^ 
i=l 

Since  z  ana"  z  are  the  initial  poles,  they  are  inside  the  unit  cir- 
cle. Hence  h  (n)  1S  an  infinite  sequence  decreasing  in  magnitude.  If  the 
sequence  is  truncated  at  the  (p  +  1)  term,  the  remaining  portion  of  the 
sequence 


fy")  - 


i=l 


n  =  0,  1,  .. . ,  p 


otherwise 


Taking  the  z-transform 


/\ 


O  /.N.-n 


Hn(z)  =  V   hp(n)z 


n=0 


P         2 

£  Eai2"z 

n=0     1-1 


X  P  /Zi 


2 

£a 

i=l 


[l  -  (Zi/z)^1 

TL  i  -  ivz)  . 


^  ai  V-   ai(zi/z) 

^  i  -   U,-/z)  "  2^    1  -  (z./z) 
i-1  1  i=l  7 
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Thus, 

2      a,z  p+1 


1=1  ] 


Comparing  (34)  and  (35)  and,  since  H  (z)  is  a  finite  impulse  response 
transfer  function,  equating  H  (z)   and  a(z) 

2     a  z  p+1 
G(z)   _  ^  aiZi 

TTITT-  Z,  1   -  z./z 

i=l  1 

Thus 

2     a.z.p+1D(z) 

G(Z)  =  £  l-(zJz)  (36) 

i=l  ] 

The  intent  of  this  procedure  is  to  prove  that,  for  some  high  value  of  p,  all 
roots  of  a(z)   are   inside  the  unit  circle  in  the  z-domain.     From  (33) 

D(z)a(z)   =  1   -  z~(p+1)G(z)  (37) 

Since  D(z)  is  the  denominator  of  the  original  transfer  function,  all  roots  of 
D(z)  are  inside  the  unit  circle.  Hence  all  roots  of  o(z)  are  also  inside  if, 
and  only  if,  all  roots  of  the  right  hand  expression  of  (37)  are  inside  the  unit 
circle.  To  do  this,  let  tilde  (~)  denote  the  result  of  the  mapping 
z~  -»  z,  i.e.,  T(z)  =  f(z~  )  and  A(z)  is  defined  as 

A(z)  =  1  -  z(p+1)G(z)  =  D(z)a(z)  (38) 

All  roots  of  D(z)  are  exterior  (outside  the  unit  circle).  Hence,  all  roots  of 
a(z)  are  exterior  if  all  roots  of  A(z)  are  exterior.  To  prove  this,  Rouche's 
theorem  is  used  which  states  that:  "If  f(z)  and  g(z)  are  analytic  inside  and 
on  a  closed  contour  C,   and  |  g(z)  |  <  |  f(z)  |  on  C,   then  f(z)   and  f(z)  +  g(z)   have 
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the  same  number  of  zeroes  inside  C. "  Clearly,  A  in  (38)  is  analytic  within  and 
on  the  unit  circle.  The  constant  "1"  has  no  interior  zeroes.  Hence  A(z)  will 
have  no  interior  zeroes  if,  on  the  unit  circle, 

|  z(p+1)G(z)  |  <  1   ,   z  =  ej*  (39) 

From  (4.20), 

2  a  z(p+1)D(z) 
|z(P+l)G(z)|      ,  ei(P+D'  f  31Zi     2)  (40) 

z=e  £{  1  "  zie 

But  |  z-f  |  <  1  for  all  i  because  these  are  poles  of  the  original  filter.  Thus, 
for  some  value  of  p  greater  than  some  critical  value,  equation  (40)  will  satis- 
fied. Note  that  (40)  may  be  satisfied  for  smaller  value  of  p  also. 

From  the  above  discussion,  it  may  be  concluded  that  for  some  high  value  of 
augmentation,  it  is  always  possible  to  ensure  that  a(z)  has  roots  inside  the 
unit  circle.  Since  these  roots  are  the  new  additional  poles  of  the  augmented 
system,  the  new  high  order  transfer  function  would  be  stable. 

As  a  result  of  this  fact,  the  following  design  method  is  suggested. 

We  assume  that  the  designer  is  given  a  stable  recursive  digital  filter 
transfer  function  H(z)  and  a  desired  sampling  (clock)  rate  of  operation,  fs. 

1.  Find  km  stage  pipeline  multiply  and  ka  stage  pipeline  add  units 
that  will  operate  at  clock  frequency  fs. 

2.  Factor  H(z)  into  I    {i    -  1  if  n  is  odd)  2nd  order  (and  if  n  is 

st 
odd,  at  most  one  1   order)  transfer  functions. 

H(z)  =  H:(z)  .  H2(z)  ...H£(z) 

where  I   =  fn/2] 

3.  For  each  H^z),   i  =  1,  2,    ...   £.     Realize  Hi(z)   as  follows: 
a.     Set  p  =  2ka  +  km  -  2, 
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b.  Determine  the  polynomial   1   -  bij_iz"p~     -  Di+2Z"         =  Rn^z~  ^' 

c.  Solve  for  the  roots  of  B  (z"1)  =0  (41) 

d.  If      all      roots      of     B  (z~  )  =  0      are      outside      the      unit      cir- 

P 

cle,  go  to  e,  else  set  p  =  p  +  1  and  return  to  step  3b. 

e.  H-j(z)  can  be  real  ized  by  a  stable  p  augmented  order  difference 
equation 


a 


^+a{P>z-1  +  ...+a^z-P-2 


u  ,  -1,   a0     al  *     '••   ap+2 

Hi(z  }  =    —   b(p)z-P-l  ■  btP)z-P-2  i?A) 

1   Dp+lz      Dp+2Z 

where  p  2  2ka  +  km  -  2 

Because  the  last  result,  we  know  that  we  can  always  find  a  large 
enough  p  so  that  (24)  will  be  stable. 

In  the  next  section,  we  will  illustrate  the  method  with  an  example  filter 
taken  from  the  literature. 

EXAMPLE 
We  will  treat  the  Butterworth  filter  considered  by  Oppenheim  and  Shafer 
[81.   This  is  a  sixth  order  filter  whose  original  transfer  function  H(s)  is 
given  by 

H(.).-j °^38 (42) 

( s^+0.396s+0. 5871 ) ( s^+1 . 083s+0. 5871 ) ( s^+1 . 4802s+0. 5871 ) 

Applying    the    bi-linear    transformation,    a    sixth    order    difference    equation    is 
obtained  that  can  be  factored   in  three  second  order  filters. 

H(z)   =  Hx(z)   •   H2(z)    •  H3(z)  (43) 
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where 

H, (z)  =     .0007378   (1  *  lzl  *  z1)  (44a) 

1  (1  -  1.2686  z"1  +  0.7051  z'L) 

H,(z)   = Q  +  2z"l  +  Z"2) r-  (44b) 

L  (1  -1.0106  z"1  +  0.3583  z"c) 

H3(z)  = (1  +  2Z'\+  Z"2) r-  (44c) 

J  (1  -  0.9044  z"1  +  0.2155  z_<i) 

The  poles   in  the  z~  plane  of  these  equations  are  as  follows: 

Px  =  0.8996  ±  j.7804  (45a) 

|  Px  |  =  1.1909 
P2  =  1.4103  ±  j.8956  (45b) 

|  P2  |  =  1.6706 
P3  =  2.0984  ±  j.4870  (45c) 


P3  |  =  2.1542 

Thus  all  three  of  the  factoring  second  order  transfer  functions  are 
stable. 

p  augmented  order  difference  equations  of  the  form  of  (16)  were  derived 
for  each  of  the  three  sections,  for  values  of  p  from  1  through  5.  H^  and 
H2  were  unstable  for  p=l  and  stable  for  all  values  of  p  from  2  through  5. 
H3  was  stable  for  all  values  of  p  from  1  through  5. 

If  we  had  a  2  stage  adder  and  a  2  stage  multiplier,  (21)  would  yield  a  p 
of  4,  which  for  our  example  adds  stable  poles  to  each  section.  The  general 
structure  of  each  of  the  2nd  order  (4-augmented)  sections  would  be  as  shown  in 
figure  15. 


22 


Finally,  the  values   for  the  coefficient  of  each  of  the  multipliers   would 

be  as  shown   in  table  3. 

(41 
Note    that    the    av    '    coefficient    for    the    first    section    are    all     on    the 

order  of  .001  in  magnitude.  These  coefficient  could  easily  be  computed  to  more 
significant  figures,  although  realization  of  more  significant  figures  in  the 
coefficient  of  an  actual  filter  would  require  either  scaling  or  the  use  of  a 
floating  point  number  system. 

This  example  shows  how  an  IIR  filter  can  be  realized  using  high  rate  pipe- 
line modules,  with  the  ultimate  objective  of  achieving  a  higher  sample  rate 
than  possible  with  non-pipelined  multiplier  and  adders. 
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SUMMARY  AND  CONCLUSIONS 

In  this  paper  we  have  developed  a  method  for  applying  pipeline  techniques 
to  the  design  of  high  speed  recursive  digital  filters.  Using  these  techniques, 
recursive  digital  filters  operating  at  rates  hitherto  impossible  can  be 
designed.  The  general  structure  of  the  filter  and  the  method  for  calculating 
the  multiplier  coefficients  is  presented.  The  stability  of  the  resulting  real- 
ization has  been  investigated  and  a  technique  for  ascertaining  the  stability  of 
the  realization   is  presented. 

A  significant  example  taken  from  the  literature  illustrates  the  technique 
and  demonstrates  the  existance  of  a  stable,  high  rate  pipeline  realization  of  a 
practically  useful   filter. 
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Coeff  of  Dp+1 

Coeff  of  Dp+2 

p 

P+l 

b(p) 
V2 

0 

bl 

b2 

1 

b?  +  b2 

blb2 

2 

b^  +  2bxb2 

2           .2 

b,  b~  +  b« 

3 

4           2             2 
b7  +  3b£b2  +  bc2 

4               2  2         3 
b^b2  +  3b^b2  +  b2 

4 

b\  +  4b^b2  +  3b:b^ 

4                2   2          3 
b^b2  +  3b^b2  +  b2 

5 

bl+5blb2+6blb2+b2 

b^b2+4b^b2+3b1b2 

Table  la.     b's  for  p  =   0,   1,    ...5. 


p 

a0 

4P) 

>> 

a2 

a3 

> 

a4 

4P) 

0 

0 

0 

0 

0 

0 

1 

bi 

0 

0 

0 

0 

2 

b. 

b?  +  b2 

0 

0 

0 

3 

bl 

bl  +  b2 

b^  +  2bxb2 

0 

0 

4 

b. 

bl+b2 

b^  +  2bxb2 

4            2              2 
b*  +  3b^b2  +  b2 

0 

5 

bl 

bl  +  b2 

b^  +  2bxb2 

4            2              2 
b*  +  3b^b2  +  b2 

bl  +  4blb2  +  3blb2 

Table  lb.     a's  for  p  =  0,  1,    ...   5. 
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Figure  4a.     One-stage  pipeline  mul tiply-add  unit 


Figure  4b„     One-stage  pipeline  multiply   feeding  one-stage  add 
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Figure  5.       ka  +  km  stage  pipeline  multi ply-add  module 
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2 

Polynomial2  F{z)-z  -b  z-b  =0 

Conditions  for  Stability' 

l-b,-b2>0 © 

l  +  brb2>0 © 

b2|<l  © 


Figure  10.     Condition   for  stability  of  the  original    filter 
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Polynomial*  F(z)=  z+b  =0 
Conditions  for  Stability5 


1  +  b^O 
1  -b^O 


+  b 


Figure  11.  Condition  for  stability  of  the  augmented  filter:  p=l 
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2  2 

Polynomial:  F(z)  =  z  +b1z  +  (b1  +  b2) 

Conditions  for  Stability5 

1  +  b  +(b*+b  )>0  © 

1  +  b+(b^b     >0 © 

|b2+bl<l  


Figure  12.     Condition   for  stability  of  the  augmented   filter:     p=2 
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Polynomial:    F(z)=  23+b122+(o^+b2)2+(b^+2b1b2) 
Conditions  for  Stability: 

l+b1+(b^+b2)+(b^+2b1b2)  >  0  © 

l-D1  +  (b^b2)-(b^2b]b2)  >  0  © 

|b^2b1b2]<  1        © 


(b^2b1b2)2-Tj    >    lb1  (b3+2b1b2)-(b2+b2)|  ..© 


Fig.    13       Stability  of  augmented  filter:    p=3 
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Figure  14.   (a)  Original  transfer  function  (b)  Equivalent  augmented  transfer 
function 
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